Blind MVA Speech Feature Processing on Aurora 2.0
Authors
Abstract
This paper focuses on the MVA (mean subtraction, variance normalization, and ARMA filtering) feature post-processing scheme for noise-robust automatic speech recognition. MVA has shown great success in the past on the Aurora 2.0 and 3.0 corpora. To test its generality, this work applies MVA blindly to many different acoustic feature extraction methods and evaluates it on the Aurora 2.0 corpus. Specifically, we apply MVA post-processing to the following feature extraction techniques: MFCC, LPC, PLP, RASTA, Tandem, Modulation-filtered Spectrogram, and Modulation Cross-CorreloGram. We find that, while effectiveness depends on the extraction method used, the majority of features benefit significantly from MVA. We conclude with a brief analysis.
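The three MVA stages named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes per-utterance statistics, a hypothetical ARMA order `order=2`, unfiltered edge frames, and the commonly cited ARMA form in which each output averages the previous `order` filtered outputs with the current and next `order` normalized inputs. The function name `mva` is likewise an illustrative choice.

```python
import math

def mva(features, order=2):
    """Mean subtraction, variance normalization, then ARMA smoothing.

    features: list of frames for one utterance; each frame is a list of floats.
    order: ARMA order M (hypothetical default, not taken from the paper).
    """
    n_frames = len(features)
    n_dims = len(features[0])
    out = [[0.0] * n_dims for _ in range(n_frames)]
    for d in range(n_dims):
        track = [frame[d] for frame in features]
        # Mean subtraction and variance normalization over the utterance.
        mean = sum(track) / n_frames
        var = sum((v - mean) ** 2 for v in track) / n_frames
        std = math.sqrt(var) or 1.0  # avoid dividing by zero on constant tracks
        norm = [(v - mean) / std for v in track]
        # ARMA smoothing; the first and last `order` frames stay unfiltered
        # (an assumption made here to keep the sketch short).
        smoothed = list(norm)
        for t in range(order, n_frames - order):
            past = sum(smoothed[t - order:t])      # previously filtered outputs
            future = sum(norm[t:t + order + 1])    # current and upcoming inputs
            smoothed[t] = (past + future) / (2 * order + 1)
        for t in range(n_frames):
            out[t][d] = smoothed[t]
    return out
```

Because the function only sees a matrix of feature frames, it can be applied "blindly" to the output of any front end (MFCC, PLP, and so on) without knowing how the features were computed.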
Similar resources
ARMA Filtering of Speech Features Using Energy Based Weights
In this paper, a robust feature compensation method to deal with the environmental mismatch is proposed. The proposed method applies energy based weights according to the degree of speech presence to the Mean subtraction, Variance normalization, and ARMA filtering (MVA) processing. The weights are further smoothed by the moving average and maximum filters. The proposed feature compensation algo...
Empirical mode decomposition for noise-robust automatic speech recognition
In this paper, a novel technique based on the empirical mode decomposition (EMD) methodology is proposed and examined for the noise-robustness of automatic speech recognition systems. The EMD analysis is a generalization of the Fourier analysis for processing non-linear and non-stationary time functions, in our case, the speech feature sequences. We use the first and second intrinsic mode funct...
Low-resource noise-robust feature post-processing on Aurora 2.0
We present a highly effective and extremely simple noise-robust front end based on novel post-processing of standard MFCC features. It performs remarkably well on the Aurora 2.0 noisy-digits database without requiring any increase in model complexity. Compared to the Aurora 2.0 baseline system, our technique improves the average word error rate by 45% in the multicondition training case, (matched...
Noise-robust speech feature processing with empirical mode decomposition
In this article, a novel technique based on the empirical mode decomposition methodology for processing speech features is proposed and investigated. The empirical mode decomposition generalizes the Fourier analysis. It decomposes a signal as the sum of intrinsic mode functions. In this study, we implement an iterative algorithm to find the intrinsic mode functions for any given signal. We desi...
Missing-feature reconstruction for band-limited speech recognition in spoken document retrieval
In spoken document retrieval, it is necessary to support a variety of audio corpora from sources that have a range of conditions (e.g., channels, microphones, noise conditions, recording media, etc.). Varying band-limited speech represents one of the most challenging factors for robust speech recognition. The missing-feature reconstruction method shows the effectiveness in recognition of the sp...